Structural Tags, Annealing and Automatic Word Classiication

نویسندگان

  • J Mcmahon
  • F J Smith
چکیده

We describe an automatic word classiication system which uses a locally optimal annealing algorithm and average class mutual information. A new word-class representation, the structural tag is introduced and its advantages for use in statistical language modelling are presented. A summary of some results with the one million word lob corpus is given; the algorithm is also shown to discover the vowel-consonant distinction and displays an ability to cluster words syntactically in a Latin corpus. Finally, a comparison is made between the current classiication system and several alternative systems, which shows that the current system performs tolerably well.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Structural Tags, Annealing and Automatic Word Classification

This paper describes an automatic word classiication system which uses a locally optimal annealing algorithm and average class mutual information. A new word-class representation, the structural tag is introduced and its advantages for use in statistical language modelling are presented. A summary of some results with the one million word lob corpus is given; the algorithm is also shown to disc...

متن کامل

Statistical Language Processing Based on Self-organising Word Classiication

An automatic word classi cation system has been designed which processes word unigram and bigram frequency statistics extracted from a corpus of natural language utterances. The system implements a type of simulated annealing which employs an average class mutual information metric. Resulting classi cations are hierarchical, allowing variable class granularity. Words are represented as structur...

متن کامل

Tags Re-ranking Using Multi-level Features in Automatic Image Annotation

Automatic image annotation is a process in which computer systems automatically assign the textual tags related with visual content to a query image. In most cases, inappropriate tags generated by the users as well as the images without any tags among the challenges available in this field have a negative effect on the query's result. In this paper, a new method is presented for automatic image...

متن کامل

Automatic Hashtag Recommendation in Social Networking and Microblogging Platforms Using a Knowledge-Intensive Content-based Approach

In social networking/microblogging environments, #tag is often used for categorizing messages and marking their key points. Also, since some social networks such as twitter apply restrictions on the number of characters in messages, #tags can serve as a useful tool for helping users express their messages. In this paper, a new knowledge-intensive content-based #tag recommendation system is intr...

متن کامل

Diatom Identi cation: a Double Challenge Called ADIAC

This paper introduces the project ADIAC (Automatic Diatom Identiication and Classiication), which started in May 1998 and which is nanced by the European MAST (Marine Science and Technology) programme. The main goal is to develop algorithms for an automatic identiication of diatoms using image information: both valve shape (contour) and ornamentation. The paper presents the goals of the project...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1994